MATLAB for Machine Learning by Ciaburro Giuseppe
Author: Giuseppe Ciaburro
Language: eng
Format: azw3
Tags: COM018000 - COMPUTERS / Data Processing, COM037000 - COMPUTERS / Machine Theory, COM004000 - COMPUTERS / Intelligence (AI) and Semantics
Publisher: Packt Publishing
Published: 2017-08-28T04:00:00+00:00
Figure 5.5: Graphic description of the tree
Figure 5.5 immediately gives us useful information on the classification of the three floral species. In most cases, a decision tree is built to predict class labels or responses; indeed, after creating a tree, we can easily predict responses for new data. Suppose a new set of four measurements, representing the sepal length and width and the petal length and width of a specimen of an unknown floral species, has been recorded:
>> measNew = [5.9 3.2 1.3 0.25];
To predict the classification of the new data based on ClassTree, the tree previously created and trained, enter this:
>> predict(ClassTree,measNew)
ans =
cell
'setosa'
The predict() function returns a vector of predicted class labels for the predictor data in the table or matrix, based on the trained classification tree. In this case, there is a single prediction because the variable passed to the function contained only one record. For a data matrix containing multiple observations, we would obtain one result per row of the matrix.
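As a sketch of the multi-row case, assuming ClassTree has been trained on the Fisher iris data as in the previous sections, passing a matrix with one observation per row returns one predicted label per row (the three observations here are hypothetical measurements, not taken from the dataset):

```matlab
% Hypothetical new observations, one per row
% (column order: sepal length, sepal width, petal length, petal width)
measBatch = [5.9 3.2 1.3 0.25;
             6.4 2.9 4.3 1.30;
             7.1 3.0 5.9 2.10];
labels = predict(ClassTree, measBatch);  % 3x1 cell array of class names
```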
So far, we have learned to build a classification tree from our data. Now we need to test the model's performance in predicting new observations. But what tools are available to measure the tree's quality?
To begin, we can calculate the resubstitution error: the difference between the responses in the training data and the predictions the tree makes for those same training inputs. It provides an initial estimate of the model's performance, but it is informative in one direction only: a high resubstitution error indicates that the tree's predictions will be poor, whereas a low resubstitution error does not guarantee good predictions for new data, so in that case it tells us nothing.
To calculate the resubstitution error, simply type:
>> resuberror = resubLoss(ClassTree)
resuberror =
0.0200
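To see what this figure measures, we can, as a sketch, recompute it by hand: compare the tree's resubstitution predictions with the training labels (this assumes ClassTree was trained on the meas and species variables from the fisheriris dataset, as in the previous sections):

```matlab
load fisheriris                       % meas (150x4) and species (150x1 cell)
predicted = resubPredict(ClassTree);  % predictions on the training data itself
% Fraction of training observations the tree misclassifies:
manualError = sum(~strcmp(predicted, species)) / numel(species);
% manualError should match the value returned by resubLoss(ClassTree)
```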
The calculated low value suggests that the tree classifies nearly all of the data correctly. To improve the measure of the predictive accuracy of our tree, we perform cross-validation of the tree. By default, cross-validation splits the training data into 10 parts at random. It trains 10 new trees, each one on nine parts of the data. It then examines the predictive accuracy of each new tree on the data not included in training that tree. As opposed to the resubstitution error, this method provides a good estimate of the predictive accuracy of the resulting tree, since it tests new trees on new data:
>> cvrtree = crossval(ClassTree)
cvrtree =
classreg.learning.partition.ClassificationPartitionedModel
CrossValidatedModel: 'Tree'
PredictorNames: {'x1' 'x2' 'x3' 'x4'}
ResponseName: 'Y'
NumObservations: 150
KFold: 10
Partition: [1×1 cvpartition]
ClassNames: {'setosa' 'versicolor' 'virginica'}
ScoreTransform: 'none'
Properties, Methods
>> cvloss = kfoldLoss(cvrtree)
cvloss =
0.0733
First, we used the crossval() function, which performs a loss estimate using cross-validation and returns a cross-validated classification model; its properties then became available in the MATLAB workspace. Then we calculated the classification loss for the observations not used for training by using the kfoldLoss() function. The low value obtained confirms the quality of the model.
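The number of folds need not be 10: it can be changed through the 'KFold' name-value pair of crossval(). As a sketch, assuming the same ClassTree, a 5-fold estimate would be obtained with:

```matlab
rng(1);                                      % fix the random partition for repeatability
cvrtree5 = crossval(ClassTree, 'KFold', 5);  % cross-validated model with 5 folds
cvloss5  = kfoldLoss(cvrtree5);              % out-of-fold classification loss
```

Because the partition is random, the loss will vary slightly from run to run unless the random seed is fixed as above.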